[Dashboard] Remove gpustats dependencies from Ray[default] #41044

Merged
merged 9 commits into from
Nov 22, 2023

Conversation

jonathan-anyscale
Contributor

@jonathan-anyscale jonathan-anyscale commented Nov 9, 2023

Why are these changes needed?

Add a method to get GPU utilization, similar to how gpustats did it, and remove gpustats from the ray[default] dependencies.

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
Collaborator

@jjyao jjyao left a comment

lg. Could you update the PR description?

dashboard/client/src/pages/node/index.tsx (outdated, resolved)
dashboard/modules/reporter/reporter_agent.py (outdated, resolved)
dashboard/modules/reporter/reporter_agent.py (outdated, resolved)
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
@jjyao jjyao requested a review from wookayin November 15, 2023 16:51
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
@jjyao
Collaborator

jjyao commented Nov 16, 2023

@ericl needs your approval here as code owner.

Context: we are replacing the GPUtil and gpustats dependencies with a vendored, single-file pynvml library (the official NVIDIA library that other libraries use internally) so that it works with minimal Ray. This also removes the unnecessary transitive dependencies that gpustats pulls in for terminal display.

@jjyao jjyao requested a review from wookayin November 16, 2023 00:59
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
Comment on lines 441 to 450
memory_used=int(pynvml.nvmlDeviceGetMemoryInfo(gpu_handle).used) // MB,
memory_total=int(pynvml.nvmlDeviceGetMemoryInfo(gpu_handle).total) // MB,
Contributor

We could merge these two nvmlDeviceGetMemoryInfo calls into one.

Contributor

FYI, in NVIDIA driver 510.39.01, a v2 memory info API was added:

https://github.com/NVIDIA/nvidia-settings/blob/510.39.01/src/nvml.h#L218-L241

The unversioned API (v1) and v2 API will return different results on R510+ drivers.
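
To illustrate why the numbers differ, here is a small model of the two structs as I read the field comments in the linked header: v1 `used` is the sum of reserved and allocated memory, while v2 reports `reserved` separately and `used` covers allocated memory only. The class names, the `v2_to_v1` helper, and the sample values are hypothetical and not part of pynvml.

```python
from dataclasses import dataclass


@dataclass
class MemoryV1:
    """Model of the unversioned (v1) struct; values in MB for readability."""

    total: int
    free: int
    used: int  # assumed: reserved + allocated


@dataclass
class MemoryV2:
    """Model of the v2 struct, which adds a separate `reserved` field."""

    total: int
    reserved: int  # memory set aside for driver or firmware use
    free: int
    used: int  # assumed: allocated only


def v2_to_v1(info: MemoryV2) -> MemoryV1:
    """Fold `reserved` back into `used` to mimic what v1 would report."""
    return MemoryV1(total=info.total, free=info.free, used=info.used + info.reserved)


# Made-up sample values, not measurements from a real GPU.
v2 = MemoryV2(total=16384, reserved=300, free=15084, used=1000)
v1 = v2_to_v1(v2)
print(v2.used, v1.used)
```

Under these assumptions, an otherwise idle GPU would show a noticeably larger `used` value through the v1 API than through v2, so whichever version the dashboard relies on should be documented.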

Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
# nvidia-ml-py version
Collaborator

Add a comment explaining why we picked this version, something like: we use this version because it uses the v2 API and supports a wider range of drivers.

Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
Contributor

@ericl ericl left a comment

Always great to see deps removed

@jjyao jjyao merged commit 9b9fb55 into ray-project:master Nov 22, 2023
2 checks passed
can-anyscale added a commit that referenced this pull request Nov 24, 2023
…41044)" (#41375)

Reverts #41044

premerge is busted and potentially blocking people from merging into branch cut. Revert to unblock

Failing test: linux://python/ray/tests:test_streaming_generator_2

This reverts commit 9b9fb55.
@jonathan-anyscale jonathan-anyscale deleted the remove_gpustats branch November 27, 2023 17:29